tts : add SNAC decoder architecture support for Orpheus TTS #318
Summary
This PR adds foundational architecture support for the SNAC (Multi-Scale Neural Audio Codec) decoder to enable Orpheus TTS models in llama.cpp. This addresses issue #208.
Note: This PR contains only the architecture infrastructure and does not include model loading, forward pass implementation, or TTS tool integration. It cannot run SNAC models yet but provides the foundation for those components.
Changes
Architecture Registration
- Added the LLM_ARCH_SNAC_DEC architecture enum and registered the "snac-dec" name
- Files: src/llama-arch.h and src/llama-arch.cpp
Tensor Definitions (27 new tensor types)
Decoder tensors:
- SNAC_DEC_CONV_IN, SNAC_DEC_CONV_OUT
- SNAC_DEC_ATTN_NORM, SNAC_DEC_ATTN_Q/K/V/OUT
- SNAC_DEC_BLK_CONV_UP, SNAC_DEC_BLK_CONV1/2/3, SNAC_DEC_BLK_SNAKE_ALPHA
Vector quantizer tensors (4 levels):
- SNAC_VQ_IN_PROJ, SNAC_VQ_OUT_PROJ
- SNAC_VQ_CODEBOOK
Encoder tensors (included for completeness, not needed for TTS inference):
- SNAC_ENC_* prefix
Model Conversion
Implemented the SnacDecModel class in convert_hf_to_gguf.py:
- Weight norm handling (tensors suffixed _g or _v)
- Hyperparameters: codebook_size, decoder_rates, latent_dim, decoder_dim
Documentation
Added docs/SNAC_IMPLEMENTATION.md.
Review Focus Areas
The SnacDecModel class is missing a @ModelBase.register() decorator. Without this, the conversion class won't be invoked. Need to determine the correct HuggingFace architecture name to register.
Other items to review:
- Whether the handling of the _g and _v suffixes is correct for SNAC's weight norm implementation
- The llama-arch.cpp mappings - note the use of %d for block indices vs {bid} in Python
Testing Status
❌ Not tested with actual models yet - this is infrastructure-only
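Even without a model, the %d versus {bid} point raised in the review items can be sanity-checked on its own. The sketch below is only an illustration of that check: the tensor name strings (for example "dec.blk.%d.conv_up") and the helper names are hypothetical and are not taken from this PR. The real templates would have to be copied from llama-arch.cpp and from the Python side.

```python
# Hypothetical name templates for illustration only; the real strings live in
# src/llama-arch.cpp (C-style "%d") and in the Python tensor tables ("{bid}").
CPP_NAMES = {
    "SNAC_DEC_CONV_IN": "dec.conv_in",
    "SNAC_DEC_BLK_CONV_UP": "dec.blk.%d.conv_up",
    "SNAC_DEC_BLK_SNAKE_ALPHA": "dec.blk.%d.snake_alpha",
}
PY_NAMES = {
    "SNAC_DEC_CONV_IN": "dec.conv_in",
    "SNAC_DEC_BLK_CONV_UP": "dec.blk.{bid}.conv_up",
    "SNAC_DEC_BLK_SNAKE_ALPHA": "dec.blk.{bid}.snake_alpha",
}


def c_to_python_template(c_fmt):
    """Rewrite a C-style template ("%d") into the Python form ("{bid}")."""
    return c_fmt.replace("%d", "{bid}")


def mismatched_names(cpp_names, py_names):
    """Return the tensor keys whose C++ and Python templates disagree.

    A key counts as mismatched if it is missing on either side, or if the
    C++ template does not equal the Python one after converting %d to {bid}.
    """
    bad = []
    for key in sorted(set(cpp_names) | set(py_names)):
        cpp = cpp_names.get(key)
        py = py_names.get(key)
        if cpp is None or py is None or c_to_python_template(cpp) != py:
            bad.append(key)
    return bad


if __name__ == "__main__":
    # An empty list means both sides agree for every listed tensor.
    print(mismatched_names(CPP_NAMES, PY_NAMES))
```

A check of this kind only compares the templates as strings. It does not confirm that the converted GGUF file loads, which still needs the model loading work listed under Next Steps.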
To test after merging:
Next Steps
Remaining work tracked in docs/SNAC_IMPLEMENTATION.md:
- Add the @ModelBase.register() decorator to SnacDecModel
- Model loading in llama-model.cpp
- Forward pass in llama.cpp (convolutions, Snake activation, attention)
References
Link to Devin run: https://app.devin.ai/sessions/f86c58111acb4011894cbaad18a50e62
Requested by: Jake Cosme (@jakexcosme)